Episode iterator upgrades #216

erikwijmans · 2019-09-28T16:42:47Z

Motivation and Context

Add some upgrades to the episode iterator functionality. The primary addition is the ability to shuffle scenes after some number of steps have been taken. If you shuffle after some number of episodes have ended, you end up shuffling too often at the start of training and not often enough towards the end. The step threshold is set to 10k by default. I consider this a "high but sane" default value: not so low that it dramatically slows down training, but not so high that scenes are never swapped.
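
To illustrate the difference between the two triggers, here is a minimal sketch (not the code in this PR). It assumes, purely for illustration, that episodes are short early in training (10 steps) and long later on (200 steps). `count_switches`, `max_episodes`, and `max_steps` are hypothetical names, and the sketch only checks for a switch at episode boundaries.

```python
def count_switches(episode_lengths, max_episodes=None, max_steps=None):
    """Count scene switches over a sequence of episode lengths (in steps)."""
    switches = 0
    episodes_seen = 0
    steps_seen = 0
    for length in episode_lengths:
        episodes_seen += 1
        steps_seen += length
        due_by_episodes = max_episodes is not None and episodes_seen >= max_episodes
        due_by_steps = max_steps is not None and steps_seen >= max_steps
        if due_by_episodes or due_by_steps:
            switches += 1
            episodes_seen = 0
            steps_seen = 0
    return switches


# Assumed episode lengths: both phases cover the same 100k environment steps.
early_phase = [10] * 10_000  # many short episodes
late_phase = [200] * 500     # few long episodes

by_episodes = (
    count_switches(early_phase, max_episodes=100),
    count_switches(late_phase, max_episodes=100),
)
by_steps = (
    count_switches(early_phase, max_steps=10_000),
    count_switches(late_phase, max_steps=10_000),
)
print("episode trigger (early, late):", by_episodes)
print("step trigger (early, late):", by_steps)
```

Under these assumed lengths, the episode-based trigger switches 100 times in the early phase and 5 times in the late phase, while the step-based trigger switches 10 times in each, i.e. at a constant rate per step.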

Also randomizes the shuffle switch values to be within [0.8, 1.2] times the configured value. This keeps all workers from swapping scenes at the same point in the shuffle-by-steps mode.

Adds an initial shuffle to the list of episodes as otherwise the same scene will always be initially loaded (which is bad).

Sets the max steps per scene to a high but sane default value. We generated a huge number of episodes per scene for training (50k IIRC) and you basically never switch scenes toward the end of training with that.

Also makes the scene order get shuffled by default as this makes sense for the default (people rarely change the defaults, so they should be set reasonably).

How Has This Been Tested

Via the tests

Types of changes

New feature (non-breaking change which adds functionality)

mathfac · 2019-10-01T17:27:00Z

@JasonJiazhiZhang feel free to comment on the PR.

mathfac

Thank you for bringing this to master. I checked the logic of EpisodeIterator and it's not so easy to follow. Maybe in the future we would like to have separate, easier-to-understand subclasses of EpisodeIterator.
Requested some changes.

mathfac · 2019-10-01T19:20:40Z

habitat/core/dataset.py

            self._rep_count = 0
+            self._step_count = 0
+
+        self._switch_scene_if()


next_episode is not updated based on results of self._switch_scene_if().

mathfac · 2019-10-01T19:28:12Z

habitat/core/dataset.py

+    def _set_shuffle_intervals(self):
+        if self.max_scene_repetition_episodes > 0:
+            self._max_rep_episode = random.randint(
+                int(0.8 * self.max_scene_repetition_episodes),


Let's create repetition_rand_interval = 0.2 param for EpisodeIterator and do:

    self._max_rep_step = random.randint(
        int((1 - self.repetition_rand_interval) * self.max_scene_repetition_steps),
        int((1 + self.repetition_rand_interval) * self.max_scene_repetition_steps),
    )

Or even create util method:

    @staticmethod
    def _randomize_value(value, interval):
        return random.randint(
            int((1 - interval) * value),
            int((1 + interval) * value),
        )
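
As a runnable sketch of the suggested helper (not necessarily what was merged): the version below is a free function rather than a method, takes an explicit `random.Random` so each worker can draw independently, and adds a range check on `interval`. The function signature, the per-worker seeds, and the `ValueError` are my assumptions for illustration.

```python
import random


def randomize_value(value: int, interval: float, rng: random.Random) -> int:
    """Return a random integer within +/- `interval` (a fraction) of `value`."""
    if not 0.0 <= interval < 1.0:
        raise ValueError("interval must be in [0, 1)")
    return rng.randint(
        int((1 - interval) * value),
        int((1 + interval) * value),
    )


# Each (hypothetical) worker draws its own threshold around the configured 10k steps,
# so workers are unlikely to all swap scenes on the same step.
thresholds = [
    randomize_value(10_000, 0.2, random.Random(worker_id)) for worker_id in range(4)
]

# An interval of 0 collapses the range to a single value, turning the jitter off.
fixed = randomize_value(10_000, 0.0, random.Random(0))
print(thresholds, fixed)
```

Passing the interval in as a parameter is what makes it possible to disable the randomization by setting it to 0.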

mathfac · 2019-10-01T19:30:48Z

habitat/core/env.py

@@ -213,6 +213,11 @@ def _update_step_stats(self) -> None:
        if self._past_limit():
            self._episode_over = True

+        if self.episode_iterator is not None and isinstance(


Although self.episode_iterator is marked as Optional, I don't see Env functioning without it here. Maybe just check for a subclass of EpisodeIterator.

A separate PR probably makes sense for changing that functionality (dataset is also marked as optional).

mathfac · 2019-10-01T19:32:16Z

habitat/core/dataset.py

+            do_switch = True
+
+        if do_switch:
+            self._shuffle_iterator()


Should we check for self.shuffle == True here when we are shuffling?

No. _shuffle_iterator is used both to initiate a scene switch and to just shuffle the episodes. self.shuffle decides whether or not to shuffle the episode order on cycle or on load.

That's strange as there was no option to enable cycling without episode shuffling.

The cycle logic doesn't rely on the shuffle

mathfac · 2019-10-01T23:18:28Z

habitat/core/dataset.py

        :param num_episode_sample: number of episodes to be sampled. :py:`-1`
            for no sampling.
        """
+        self._repetition_rand_interval = 0.2


Let's make repetition_rand_interval an __init__ argument with a default value; otherwise there is no option to turn it off.

mathfac

Great job following up on the comments. Last item:
We get a shuffle automatically once the max_counts start taking effect.
It may be more logical to jump to the next scene's episodes without shuffling if shuffle isn't enabled.

JasonJiazhiZhang

Thank you @mathfac for looping me in and thank you @erikwijmans for implementing this! I have some minor comments. Overall looks great to me!

JasonJiazhiZhang · 2019-10-02T00:51:00Z

habitat/core/dataset.py

+        else:
+            self._max_rep_step = None
+
+    def _switch_scene_if(self):


Question here: would it be switching too frequently at the later stage of training, with potentially fewer than 100 episodes per scene switch? Would it help to incorporate both count schemes, like a logical AND?

Shuffling on steps is consistent from an optimization standpoint: you swap scenes every N parameter updates, which is why I like it :)

I don't think you can switch scenes "too often". Ideally, we'd just randomly sample a new episode irrespective of the scene it is in, but this incurs the non-trivial cost of swapping the scene way too often.
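
A rough sketch of the cost mentioned here, under assumed numbers (5 scenes with 200 episodes each; episodes are reduced to their scene id). `count_scene_swaps` is a hypothetical helper that counts how often two consecutive episodes come from different scenes, which is when a new scene would have to be loaded.

```python
import random


def count_scene_swaps(scene_ids):
    """Number of times consecutive episodes come from different scenes."""
    return sum(1 for prev, cur in zip(scene_ids, scene_ids[1:]) if prev != cur)


rng = random.Random(0)

# Hypothetical dataset: episodes grouped by scene, as the iterator keeps them.
grouped = [scene for scene in range(5) for _ in range(200)]

# Sampling episodes irrespective of scene: a plain shuffle of the same list.
shuffled = grouped[:]
rng.shuffle(shuffled)

grouped_swaps = count_scene_swaps(grouped)    # one swap per scene boundary
shuffled_swaps = count_scene_swaps(shuffled)  # typically a swap on most episodes
print(grouped_swaps, shuffled_swaps)
```

Grouping by scene gives the minimum possible number of swaps for a pass over the data; switching after N steps sits between the two extremes.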

mathfac · 2019-10-09T08:51:37Z

Pulled the PR to understand the logic of the changes. As a result, added some tests and a small fix.

erikwijmans · 2019-10-09T15:31:38Z

habitat/core/dataset.py

@@ -376,9 +377,6 @@ def _switch_scene(self) -> None:

        if len(grouped_episodes) > 1:
            # Ensure we swap by moving the current group to the end
-            if self.shuffle:


Why was this shuffle deleted?

Because shuffles already happen at the beginning and on each cycle. That makes this shuffle redundant, and it can be omitted with no test failing.

Yeah, I guess both logics are valid. Either you cycle through all scenes in a predetermined but random order, or you randomly draw a new scene from the set of other scenes. One doesn't seem inherently better than the other.

My point is that it's already in predetermined but random order because of self.shuffle in other places. So there is no need to shuffle again.

Yep yep, it's just different ideas of how scenes should be selected on a swap. I am fine with this way if it makes more sense to you, both make sense to me.
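
For reference, the two ideas discussed in this thread can be sketched as follows. This is an illustration only, not the code in `_switch_scene`; the function names and scene ids are hypothetical.

```python
import random
from collections import deque


def next_scene_cycle(scene_order: deque) -> str:
    """Cycle strategy: move the current scene to the back, take the new front."""
    scene_order.rotate(-1)
    return scene_order[0]


def next_scene_random(current: str, scenes: list, rng: random.Random) -> str:
    """Random-draw strategy: pick uniformly among all scenes except the current one."""
    return rng.choice([scene for scene in scenes if scene != current])


scenes = ["scene_a", "scene_b", "scene_c"]  # hypothetical scene ids
rng = random.Random(0)

# Cycle: shuffle once up front, then walk the same order on every pass.
initial_order = scenes[:]
rng.shuffle(initial_order)
order = deque(initial_order)
cycle_visits = [order[0]] + [next_scene_cycle(order) for _ in range(5)]

# Random draw: every swap picks any other scene, so there is no fixed order.
current = scenes[0]
random_visits = []
for _ in range(6):
    current = next_scene_random(current, scenes, rng)
    random_visits.append(current)

print(cycle_visits)
print(random_visits)
```

The cycle strategy visits every scene once before repeating; the random draw only guarantees that the next scene differs from the current one.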

* Episode iterator upgrades
* Make default slightly lower
* Fix env.py
* fix is true
* Update docstring
* Fix formatting
* Turn off shuffle!
* Changes
* Fix test and incorperate comments
* Don't always shuffle on scene swap
* Rename and add doc string
* Cleanup
* Simplify shuffle logic
* Fixed small bug with initial rep_count value, added more tests, moved comments to assert messages.

erikwijmans added 2 commits September 28, 2019 12:36

Episode iterator upgrades

ebaa6ec

Make default slightly lower

3b411d0

erikwijmans requested review from mathfac and abhiskk September 28, 2019 16:42

facebook-github-bot added the CLA Signed label Sep 28, 2019

erikwijmans and others added 7 commits September 28, 2019 13:45

Fix env.py

c5de063

fix is true

bb06827

Merge branch 'episode-iterator-upgrades' of github.com:facebookresear…

69cf09a

…ch/habitat-api into episode-iterator-upgrades

Update docstring

0256a4b

Merge branch 'episode-iterator-upgrades' of github.com:facebookresear…

b1383d0

…ch/habitat-api into episode-iterator-upgrades

Fix formatting

cccfc9b

Turn off shuffle!

9cba332

mathfac requested changes Oct 1, 2019

erikwijmans added 3 commits October 1, 2019 16:46

Merge branch 'master' of github.com:facebookresearch/habitat-api into…

19b9d2a

… episode-iterator-upgrades

Changes

01847cd

Merge branch 'episode-iterator-upgrades' of github.com:facebookresear…

b892f65

…ch/habitat-api into episode-iterator-upgrades

mathfac reviewed Oct 1, 2019

Fix test and incorperate comments

96d1375

mathfac approved these changes Oct 2, 2019

Don't always shuffle on scene swap

e43bdc5

JasonJiazhiZhang reviewed Oct 2, 2019

erikwijmans added 2 commits October 1, 2019 21:37

Rename and add doc string

974dbca

Cleanup

8b8309f

mathfac reviewed Oct 2, 2019

Simplify shuffle logic

fb54383

mathfac reviewed Oct 4, 2019

Fixed small bug with initial rep_count value, added more tests, moved…

ebf8358

… comments to assert messages.

erikwijmans commented Oct 9, 2019

erikwijmans merged commit 600a1ef into master Oct 9, 2019

erikwijmans deleted the episode-iterator-upgrades branch October 9, 2019 19:39

mathfac added this to the Better engineering milestone Oct 9, 2019

danmou mentioned this pull request Oct 29, 2019

confusion about habitat-api and habitat-sim version facebookresearch/splitnet#7

Closed
